Ordinal Data Analysis via Graphical Models
Author
Abstract
Background. Undirected graphical models, or Markov random fields (MRFs), are widely used for modeling multivariate probability distributions. A considerable amount of work on MRFs has focused on continuous variables and on unordered categorical variables, also called nominal variables. However, data from many real-world applications involve ordered categorical variables, also called ordinal variables (e.g., movie ratings on Netflix, which are ordered from 1 to 5 stars). While one can model ordinal variables using models designed for continuous or nominal variables, this can result in incorrect inferences about the variables. Recent work has designed graphical models for ordinal data, but the proposed estimator for learning this model requires solving a difficult non-convex optimization problem, which is computationally expensive and comes with no statistical guarantees.
Aim. Given multivariate ordinal data, we aim to estimate the joint probability distribution and the conditional dependency structure in the data. To this end, we provide a new estimator for the ordinal probit model (a graphical model for ordinal data) that is computationally efficient and comes with statistical guarantees.
Data. We analyze the HINTS-FDA dataset, a survey on how people access and use smoking- and cancer-related information and how they perceive the risks of smoking.
Results. We apply our estimator to the HINTS-FDA data to understand people's smoking behavior and their perceptions of smoking risks. Our analysis suggests that people who smoke perceive smoking as less harmful than people who don't smoke, and that a lack of awareness of smoking risks could be a reason why many people smoke.
Conclusion. We have proposed a new estimator for learning the ordinal probit model that is computationally tractable and scales easily to large datasets. We empirically corroborated the superior performance of our estimator for the probit model.
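As an illustration of the model class the abstract refers to, the following is a minimal sketch that assumes the standard latent-Gaussian reading of an ordinal probit model: each ordinal variable is obtained by cutting a latent normal variable at fixed thresholds, and zeros in the latent precision matrix encode conditional independence. The function name, the precision matrix, and the thresholds are hypothetical choices made for this example. The code only simulates data; it is not the estimator proposed in the paper.

```python
import numpy as np

def sample_ordinal_probit(precision, thresholds, n_samples, seed=0):
    """Draw ordinal samples by discretizing a zero-mean latent Gaussian.

    precision  : (p, p) positive-definite latent precision matrix (illustrative)
    thresholds : list of p sorted 1-D arrays of cut points, one per variable
    """
    rng = np.random.default_rng(seed)
    p = precision.shape[0]
    covariance = np.linalg.inv(precision)
    latent = rng.multivariate_normal(np.zeros(p), covariance, size=n_samples)

    ordinal = np.empty((n_samples, p), dtype=int)
    for j in range(p):
        # searchsorted returns the index of the interval each latent value
        # falls into; adding 1 makes the levels start at 1.
        ordinal[:, j] = np.searchsorted(thresholds[j], latent[:, j]) + 1
    return ordinal

# Hypothetical chain structure: variables 0 and 2 are conditionally
# independent given variable 1 (zero entry in the precision matrix).
precision = np.array([
    [1.0, 0.4, 0.0],
    [0.4, 1.0, 0.4],
    [0.0, 0.4, 1.0],
])
# Four cut points per variable give five ordered levels (1 to 5).
thresholds = [np.array([-1.0, -0.3, 0.3, 1.0])] * 3

ratings = sample_ordinal_probit(precision, thresholds, n_samples=1000)
```

Here `ratings` is a 1000-by-3 integer array with values from 1 to 5. Learning the model means going in the opposite direction: recovering the precision matrix and thresholds from data such as `ratings`.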
Similar resources
Simulating from graphical models for ordinal categorical data
Multivariate ordinal categorical data is encountered in many fields of research. For analysis and data reduction the conditional independence properties of these data are studied in graphical models. However, to simulate multivariate ordinal data with a specific conditional independence structure, for use in simulation studies or computer intensive methods of inference, is non-trivial. We prese...
Bayesian model determination for multivariate ordinal and binary data
We consider how to compare different conditional independence specifications for ordinal categorical data, by calculating a posterior distribution over classes of graphical models. The approach is based on the multivariate ordinal probit model (Chib and Greenberg, 1998) where the data are considered to have arisen as truncated multivariate normal random vectors. By parameterising the precision ...
Ordinal Graphical Models: A Tale of Two Approaches
Undirected graphical models or Markov random fields (MRFs) are widely used for modeling multivariate probability distributions. Much of the work on MRFs has focused on continuous variables, and nominal variables (that is, unordered categorical variables). However, data from many real world applications involve ordered categorical variables also known as ordinal variables, e.g., movie ratings on...
Copula Gaussian Graphical Models and Their Application to Modeling Functional Disability Data
We propose a comprehensive Bayesian approach for graphical model determination in observational studies that can accommodate binary, ordinal or continuous variables simultaneously. Our new models are called copula Gaussian graphical models (CGGMs) and embed graphical model selection inside a semiparametric Gaussian copula. The domain of applicability of our methods is very broad and encompass m...
A survey of Bayesian Data Mining - Part I: Discrete and semi-discrete Data Matrices
This tutorial summarises the use of Bayesian analysis and Bayes factors for finding significant properties of discrete (categorical and ordinal) data. It overviews methods for finding dependencies and graphical models, latent variables, robust decision trees and association rules.
Journal title:
Volume, issue:
Pages: -
Publication date: 2017